Thomson Reuters at TAC 2009: ContextChain and Fractional Conditional Compressibility of Models
Abstract
This paper reports the results of the FastSum system and a simple baseline system on the TAC 2009 main task, update summarization. For the pilot task, Automatically Evaluating Summaries of Peers (AESOP), we present two novel metrics. The first metric, ContextChain, extends the recently proposed AutoSummENG metric, which compares n-gram graphs of the model summaries with those of the automatically generated summaries. Our modification builds the n-gram graphs from co-reference chains extracted from the summaries: the n-gram graph is generated from the context information of these referents. Our second metric, Fractional Conditional Compressibility of Models (FraCC), is based on the Burrows-Wheeler compression algorithm. For this metric, we use an estimate of the conditional "compressibility" of the model summaries given the system summary, defined as the increase in the compressibility of the model summary once the system summary has been observed. In addition to these two new approaches to automatically evaluating summaries, we introduce two new evaluation measures for automatic metrics, Correlation Recall and Correlation Precision, and discuss how they can shed more light on the coverage and correctness of summarization evaluation metrics.
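The abstract does not give the exact FraCC formula, so the following is only a rough sketch of the idea of conditional compressibility, not the authors' implementation. It assumes that Python's `bz2` module (bzip2 is built on the Burrows-Wheeler transform) can stand in for the compressor, and that the compressed size of the model given the system summary can be approximated by compressing their concatenation. The function names `compressed_size`, `conditional_size`, and `fractional_gain` are hypothetical.

```python
import bz2


def compressed_size(text: str) -> int:
    """Size in bytes of the bzip2-compressed UTF-8 encoding of `text`."""
    return len(bz2.compress(text.encode("utf-8")))


def conditional_size(model: str, system: str) -> int:
    """Assumed approximation: C(model | system) ~ C(system + model) - C(system)."""
    return compressed_size(system + "\n" + model) - compressed_size(system)


def fractional_gain(model: str, system: str) -> float:
    """Fraction of the model's compressed size saved once `system` has been seen.

    Higher values suggest the system summary shares more content with the model.
    """
    alone = compressed_size(model)
    return (alone - conditional_size(model, system)) / alone


if __name__ == "__main__":
    model_summary = (
        "The central bank raised interest rates by half a point on Tuesday, "
        "citing persistent inflation and a tight labour market. Analysts "
        "expect further increases before the end of the year."
    )
    system_summary = (
        "Citing persistent inflation, the central bank raised interest rates "
        "on Tuesday. Further increases are expected this year."
    )
    print(fractional_gain(model_summary, system_summary))
```

With several model summaries, one plausible choice (again an assumption) is to average `fractional_gain` over the models. Very short texts are dominated by the fixed bzip2 header overhead, so the estimate is only meaningful for summaries of reasonable length.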
Similar resources
Benchmarks for Enterprise Linking: Thomson Reuters R&D at TAC 2013
This paper describes the TRRD systems entered in the TAC 2013 entity linking challenge. We explore a restricted version of the task that accesses only an entity authority file with (possibly noisy) alternative names and plain text from the target domain. This is designed to reflect the problem of linking to existing entity authorities within companies like Thomson Reuters. We used the 2013 shar...
Thomson Reuters at TAC 2008: Aggressive Filtering with FastSum for Update and Opinion Summarization
In TAC 2008 we participated in the main task (Update Summarization) as well as the Sentiment Summarization pilot task. We modified the FastSum system (Schilder and Kondadadi, 2008) and added more aggressive filtering in order to adapt the system to update summarization and sentiment summarization. For the Update Summarization task, we show that a classifier that identifies sentences that are si...
Information access in practice: case studies at Thomson Reuters
Isabelle Moulinier is a research scientist in the Thomson Reuters corporate R&D group. Since joining the group 15 years ago, her research interests have focused on the application of information retrieval, natural language processing and machine learning technologies to the improvement of search and other aspects of the user experience. Prior to joining Thomson Reuters, she worked on text categoriz...
Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance
This paper presents our extractive summarization systems at the update summarization track of TAC 2009. This system is based on our newly developed document summarization framework under the theory of conditional information distance among many objects. The best summary is defined in this paper to be the one which has the minimum information distance to the entire document set. The best update ...
Publication date: 2009